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A Lagrangian generalization of time-reversible Born-Oppenheimer molecular dynamics [Niklasson 
et at, Phys. Rev. Lett. 97, 123001 (2006)] is proposed. The Lagrangian includes extended 
electronic degrees of freedom as auxiliary dynamical variables in addition to the nuclear coordinates 
and momenta. While the nuclear degrees of freedom propagate on the Born-Oppenheimer potential 
energy surface, the extended auxiliary electronic degrees of freedom evolve as a harmonic oscillator 
centered around the adiabatic propagation of the self-consistent ground state. The formulation 
enables the application of higher-order symplectic or geometric integration schemes that are stable 
and energy conserving even under incomplete self-consistency convergence. It is demonstrated how 
the extended Born-Oppenheimer molecular dynamics improves the accuracy by over an order of 
magnitude compared to previous formulations at the same level of computational cost. 

PACS numbers: 71.15.Pd,31.15.Ew,31.15.Qg,34.10.-|-x 
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Born-Oppenheimer molecular dynamics [U, [3, H, 0, S @| 
based on self-consistent field (SCF) methods, such as 
Hartree-Fock or density functional theory 0, H is 
currently a gold standard in molecular dynamics sim- 
ulations. It is derived from the well defined adiabatic 
approximation for the separation of the nuclear and 
electronic degrees of freedom, where the forces acting 
on the atoms are calculated at the self-consistent elec- 
tronic ground state [l^. However, the ability to achieve 
physically accurate and stable microcanonical simula- 
tions, while keeping the computational cost low, has 
been limited by the requirement of a high degree of SCF 
convergence in the nuclear force calculations [ll|j [13] ■ 
Only very recently, by restoring the time-reversal sym- 
metry in the underlying adiabatic propagation of the 
electronic degrees of freedom [ij, Hj, HBl) has it been 
possible to achieve efficient energy conserving simula- 
tions also under incomplete SCF convergence. Unfor- 
tunately, these techniques are not able to take advantage 
of powerful symplectic or geometric integration methods 
developed for celestial and classical molecular dynamics 
[H [13, [H, [H H Symplectic integration, which in 
general requires a Hamiltonian or Lagrangian formula- 
tion of the dynamics, enables highly efficient simulations 
while keeping a rigorous control over physical properties. 

The purpose of this Letter is to take advantage of geo- 
metric integration methods in Born-Oppenheimer molec- 
ular dynamics by introducing a Lagrangian general- 
ization of the recently proposed time-reversible Born- 
Oppenheimer molecular dynamics p^ . This gives time- 
reversible Born-Oppenheimer molecular dynamics a the- 
oretically more solid and physically transparent frame- 
work and, most significantly, thanks to the Lagrangian 
formulation, higher-order symplectic integration algo- 
rithms can be applied, which provide superior perfor- 
mance in molecular dynamics simulations. It will be 
demonstrated how the accuracy of the extended Born- 
Oppenheimer molecular dynamics is increased by over 



an order of magnitude at the same level of computational 
cost compared to previous formulations. 

The conventional Born-Oppenheimer (BO) Lagrangian 
for ab initio molecular dynamics is given by 



£^°(R,R) 



(1) 



where the potential C/scf is the total electronic energy 
in, for example, Hartree-Fock or density functional the- 
ory, including the nuclear- nuclear repulsion terms. The 
potential energy J7scf[R-;£'] is defined at the electronic 
ground state given by the SCF optimized solution D of 
the electronic degrees of freedom. D is assumed to be 
the symmetric single-particle density matrix in an or- 
thogonal basis-set representation, though generalizations 
to other representations, such as the density, the wave- 
functions, or the Kohn-Sham Hamiltonian, are straight- 
forward. Notice that D is not an independent dynamical 
variable, since it is determined by the external poten- 
tial at atomic configuration R — {Rk\- It is included to 
show that the Lagrangian is calculated at the self- 
consistent Born-Oppenheimer ground state. The nuclear 
degrees of freedom are given by the atomic coordinates 
Rk and velocities Rk^ with the corresponding masses Mk- 
The dots denote time derivatives. 

The key result of this Letter is the extension of the 
Born-Oppenheimer Lagrangian £^0 in Eq. ^ by the 
addition of auxiliary electronic degrees of freedom P and 
P that evolve in a harmonic potential centered around 
the self-consistent solution D. The extended auxiliary 
dynamical variables, P and P, are assumed to be of 
the same form as I?, i.e. a density matrix and its time 
derivative. The extended Born-Oppenheimer (XBO) La- 
grangian is given by 



/:^""(R,R,P,P) = C 



BO 



Tr[P^ 
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Here fi and u are fictitious mass and frequency parame- 
ters for the auxihary electronic degrees of freedom. There 
are no additional terms imposing constraints on the elec- 
tronic degrees of freedom, i.e. wavefunction orthogonal- 
ity or density matrix idempotency [13, [H, HI]- These 
constraints are not necessary, since the potential energy 
UscF [R; D] and the nuclear forces are calculated at the 
normalized and idempotent ground state D. 

The time evolution of the dynamical system described 
by the extended Lagrangian C'^^'-' is determined by 
Euler-Lagrange equations of motion: 

MkRk ^ -^^^^^iP^-^^u;'Tr[{D-P)^D/^Rk], (3) 



^iP = nLo'^{D - P). 



(4) 



In the limit /i ^ 0, i.e. when C^^^i _^ C^O ^ ^^le dynam- 
ics is determined by the equations of motion. 



MkRk 



dUscF [R; D] 
dRk 



P = LO^{D - P). 



(5) 



(6) 



These two equations of motion reveal two properties 
that are of fundamental importance in the extended 
Born-Oppenheimer molecular dynamics: a) The nuclear 
forces are calculated at the self-consistent ground state 
D as with the Born-Oppenheimer Lagrangian. The 
molecular trajectories therefore evolve on the Born- 
Oppenheimer potential energy surface with the to- 
tal Born-Oppenheimer energy, E^'~' = ^ MkR-l + 
C/scf[R;£'], as a constant of motion, b) The equations 
of motion do not include the fictitious electron mass pa- 
rameter fi, which otherwise could cause problems [25]. 

Since the extended electronic degrees of freedom P{t) 
evolve in a harmonic potential centered around their own 
self-consistent solutions D{t) in Eq. ([2]), the auxiliary 
density matrix P{t) and its self-consistent solution D{t) 
will stay close together. We can therefore use P{t) as an 
efficient initial guess to D{t) in the iterative SCF opti- 
mization, 



D{t) ^ SCF[R(t),P(i)]. 



(7) 



This strongly reduces the computational cost to reach 
the self-consistent ground state D{t) at which the nuclear 
forces are calculated in Eq. ([5]). 

In conventional Born-Oppenheimer molecular dynam- 
ics the initial guess for the iterative SCF optimization is 
not given by an auxiliary dynamical variable, as in Eq. 

gljbut b y a n extrapolation from previous time steps 
, [nl, E is 113, Hi- Unfortunately, because the SCF 
procedure is irreversible and in practice never complete, 
this extrapolation breaks the time-reversal symmetry in 
the underlying propagation of the electronic degrees of 
freedom, which causes serious stability problems with 



a systematic drift in the total energy [Tl|, Only 
by increasing the SCF convergence, at great computa- 
tional cost, can the energy drift be reduced, though it 
never fully disappears. The fundamental problem with 
the broken time-reversal symmetry in the electron prop- 
agation was recently solved by the introduction of time- 
reversible Born-Oppenheimer molecular dynamics based 
on a lossless dual filter integration scheme [13] ■ With 
the extended Lagrangian formulation the time-reversal 
problem is avoided in a similar way: the auxiliary elec- 
tronic degrees of freedom P{t), and thus the initial SCF 
guesses in Eq. ([7]), occur, not through extrapolation, but 
as dynamical variables that can be integrated by time- 
reversible algorithms [H, H^, 113] ■ The nuclear forces are 
then calculated with an underlying electron propagation 
that is time reversible. In this way a systematic energy 
drift is avoided even under incomplete SCF convergence. 

It is easy to see that the extended Lagrangian for- 
mulation is a generalization of time-reversible Born- 
Oppenheimer molecular d yna mics. If we apply the time- 
reversible Verlet scheme [2^ to the integration of the 
electronic degrees of freedom in Eq. ^ we get 

P{t + 5t) = 2P{t) - P{t - 6t) + St^LJ^ {D{t) - P{t)) . (8) 



If we choose the dimensionless factor k — 6t^ 



= 2 



this propagation is identical to the linear integration 
scheme in time-reversible Born-Oppenheimer molecular 
dynamics [l^ . Thus, the extended Born-Oppenheimer 
Lagrangian in Eq. ^ forms a natural framework for 
time-reversible Born-Oppenheimer molecular dynamics, 
which thereby is given a more rigorous and physically 
transparent formulation. Instead of propagation through 
a time-reversible dual filter process, the auxiliary elec- 
tronic degrees of freedom P{t) occur as dynamical vari- 
ables that evolve through a time-reversible integration 
scheme. 

Possibly the most important advantage of the ex- 
tended Lagrangian formulation of time-reversible Born- 
Oppenheimer molecular dynamics is that it enables the 
application of higher-order symplectic or geometric in- 
tegration methods [H, [13, [11, [il US mi- Whereas a 
conventional integration algorithm can be seen as a nu- 
merical approximation for the integration of an under- 
lying exact Hamiltonian dynamics, a symplectic integra- 
tion can be seen as an exact integration for an underlying 
approximate Hamiltonian. The conservation of various 
physical properties of the ap pro ximate Hamiltonian can 
then be rigorously controlled (ITI . [20| . [sij . For the nuclear 
coordinates in Eq. ([5]) a quite general symplectic integra- 
tion ijS, [Ml over a time length 5t is divided in m steps 
(i = 1,2, . . .,m) 



Rk{U) — Rk{ti-i) + biStRk{ti-i), 
Rk{ti) = RkiU^i) + a^StRkiU). 



(9) 



Here [Rk{to), Rk{to)] - [Rk{t), Rk{t)] and [Rk{t + 
St), Rk{t + St)] — [Rk{tm), Rk{tm)]- For the electronic 
degrees of freedom in Eq. ([5]), for i = l,2,...,m, and 



using the variable substitution 5tP{t) — > P{t), the sym- 
plectic integration is 



P{U) = P{u 

P{U) = P{U 



-l)+hK{D(t, 

-i)+a^P{U), 



i)-P{U-i)), 



(10) 



where D{U) = SCF[R(t,), Here k = dt'^uj'^, 

[P{to),Pito)] = [Pit). Pit)] and [Pit + St), Pit + St)] = 
[Pitm),Pitm)]- Examples of coefficients ai and bi for 
various number of steps m can be found in Ref . [3l| . 

Before applying a symplectic integration algorithm the 
value of the dimensionless constant k = St^LiP must be 
chosen. Since the SCF convergence is always incomplete 
the self-consistent solutions Diti) will be calculated only 
approximately. The idea is to choose k such that the in- 
tegration in Eq. PH)) is always stable under approximate 
SCF convergence. The optimal choice is the largest n- 
value that is consistent with stability, since this choice 
gives the largest value of uP' for a given time step St. 
A larger corresponds to a higher curvature of the 
harmonic potential in Eq. which keeps the auxiliary 
dynamical variables closer to the self-consistent ground 
state. This reduces the error and/or the cost for the SCF 
optimization. Based on a linearization of the SCF opti- 
mization procedure in Eq. ([7]) around its exact ground 
state D* , we can express an approximately SCF opti- 
mized density matrix as 



D^D* +TiP-D*). 



(11) 



Here F corresponds to the SCF response kernel, which is 
given as a "super matrix" acting on the matrix (P — D*). 
Assuming at least some amount of convergence in the 
SCF procedure the eigenvalue of F with the largest mag- 
nitude, 7, will be somewhere in the interval 7 € [—1, 1]. 
Following the analysis by Arias et al. [2^, we insert the 
linearized SCF expression in Eq. (|lip . with F replaced 
by 7, in the symplectic integration, Eq. (jlOp . and look at 
the homogeneous part of the equation. 



Pit) 
Pit) 



T T 



.Ti 



Pit - St) 
Pit - St) 



Here = 1,2,. 



, , to) are the matrices 

1 b,Kij - 1) 
Ui aibiKi'~i - 1) + 1 



(12) 



(13) 



Equation (|12p corresponds to a mapping of the phase 
space from one time step to the next for a linearized 
test system with the constant solution D*it) = 0. The 
optimal choice of k, is the largest value for which the 
mapping T„T„_i...Ti has all its eigenvalues on the 
unit circle for all degrees of incomplete SCF convergence, 
i.e. for 7 G [—1,1]. This case avoids exponentially in- 
creasing solutions leading to numerical instabilities or 
unphysical dissipation. The mapping in Eq. p2p always 
preserves the phase space, i.e. the "area" spanned by 
Pit) and Pit), since the determinant of the mapping 
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FIG. 1: The fluctuations in total Born-Oppenheimer energy 
{E^'-' (t) — Eo) using three different ab initio molecular dynam- 
ics approaches described in the text. The integration time 
length dt is adjusted to allow for a direct comparison between 
the methods at the same level of computational cost. 



det(T,„T,„_i...Ti) = H™idet(Tj) = 1 for aU values 
of K and 7. 

For the optimal 4th order integration scheme by 
McLachlan and Atela [31} (where ai « 0.515352837, 02 ~ 
-0.0857820194, ag ~ 0.441583024, 04 ~ 0.128846158, 
&i « 0.134496199, 62 « -0.224819803,63 « 0.756320001, 
and 64 « 0.334003603) the largest possible K-value con- 
sistent with stability under incomplete SCF convergence 
is K = 4.617. For the conventional Leap-Frog or Velocity 
Verlet scheme [3l|, as well as the time- reversible Verlet 
integration in Eq. ([8]), the largest possible value is k = 2. 

The great advantage with stability under incomplete 
SCF convergence is that any amount of convergence 
suffices for stability, which typically means that only 
one SCF cycle per force calculation is necessary. This 
is in contrast to possibly all previous higher-order ex- 
trapolation schemes (beyond linear order), for example, 
the Fock-Matrix dynamics schemes by Pulay and Fog- 
arasi [l2| and the higher-order time-reversible schemes 
by Niklasson et al. [IJ], which are unstable under in- 
complete SCF convergence, i.e. for the full interval of 
7 G [—1,1]. Simulations based on those methods will 
thus diverge if not a certain finite degree of SCF conver- 
gence can be guaranteed. To take full advantage of the 
higher-order symplectic integration schemes the accuracy 
in the electronic force calculations in Eqs. © and pU]) 
should preferably match the accuracy in the integration 
of the nuclear degrees of freedom. This generally moti- 
vates an improved SCF convergence, though it is not a 
requirement for stability. 

Figure [1] shows the fluctuations in the total Born- 
Oppenheimer energy E^'-' for a C2F4 molecule using 
Hartree-Fock theory with a standard Gaussian basis set 
for three schemes: i) conventional linear interpolation 
of the electronic degrees of freedom from two previous 
time steps, ii) time-reversible Born-Oppenheimer molec- 
ular dynamics, Eq. ([5]) and iii) the proposed sym- 
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TABLE I: Estimated total energy fluctuation amplitudes for 
the linear time-reversible (A) |l4j and the optimized 4th or- 
der symplectic (B) integration in Eqs. (|9]) and (|10|) ;31i] . (3 
SCF/force calculation) 

System A {St = 0.5 fs) B {5t = 2.0 fs) 

(H20)io (RHF/3-21G) 50 ^Hartree 2 /jHartree 
C2F4 (RHF/3-21G) 80 ^Hartree 2 /xHartree 
Fa (RHF/6-31G) 20 /xHartree 0.07 /xHartree 

plectic integration scheme, Eqs. (O and (fTU)) . with the 
optimal 4th order coefficients by McLachlan and Atela 
[3lj, with K = 4.617. In scheme i) and ii) the nuclear co- 
ordinates are integrated by the velocity- Verlet algorithm 
[30l |. Three SCF fixed point iterations without mixing 
were used in each force calculation. This example clearly 
illustrates three levels of performance. The linear inter- 
polation scheme has a rapid decay, whereas no system- 
atic energy drift is seen for the time-reversible and the 
symplectic integration schemes. Most importantly, the 
higher-order symplectic approach improves the numer- 
ical accuracy, as measured by the amplitude of the en- 
ergy fluctuations, by over an order of magnitude ('^ 1 /40) 
compared to the previous time-reversible scheme, at the 
same level of computational cost. Similar improvements 
are found in Tab. HI 

In summary, a Lagrangian formulation of time- 



reversible Born-Oppenheimer molecular dynamics was 
proposed, where extended auxiliary electronic degrees of 
freedom evolve as a harmonic oscillator centered around 
the adiabatic propagation of the self-consistent electronic 
ground state. The Lagrangian formulation enables the 
application of highly efficient and accurate symplectic or 
geometric integration methods that are shown to be sta- 
ble under incomplete SCF convergence. For example, 
using a 4th-order symplectic integration scheme it was 
demonstrated how the accuracy is improved by over an 
order of magnitude compared to previous formulations. 
With a Lagrangian formulation it is also possible to ex- 
tend microcanonical simulations to other ensembles, for 
example, to Nose-Hoover thermostats or Langevin dy- 
namics. The extended Born-Oppenheimer molecular dy- 
namics presented here, which is based on a density ma- 
trix representation, may thus open the door to alterna- 
tive and even more versatile formulations of molecular 
dynamics. 
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